Abstract

Terminological acquisition is an important issue 
in learning for NLP due to the constant termi- 
nological renewal through technological changes. 
Terms play a key role in several NLP-activities 
such as machine translation, automatic indexing 
or text understanding. In contrast to classi-
cal once-and-for-all approaches, we propose an in-
cremental process for terminological enrichment
which operates on existing reference lists and 
large corpora. Candidate terms are acquired by 
extracting variants of reference terms through 
FASTR, a unification-based partial parser. As 
acquisition is performed within specific morpho- 
syntactic contexts (coordinations, insertions or 
permutations of compounds), rich conceptual 
links are learned together with candidate terms. 
A clustering of terms related through coordina- 
tion yields classes of conceptually close terms 
while graphs resulting from insertions denote 
generic/specific relations. A graceful degradation 
of the volume of acquisition on partial initial lists 
confirms the robustness of the method to incom- 
plete data. 



Aims 

Multi-word terms and compounds play an increas- 
ing role in language analysis for the following rea- 
sons : their interpretation is rarely transparent, 
they generally denote a specific class of mental 
or real-world objects and the words composing
them are strongly related. Therefore, a correct
processing of terms ensures a higher quality in
several applications of Natural Language Process-
ing (NLP). In Machine Translation, their lack of
transparency makes word-for-word translation fail
and calls for specific descriptions. In Information
Retrieval, their high informational content makes
them good descriptors (Lewis & Croft 1990). In
parsing, the selectional restrictions found between
head words and their arguments within a term give
important clues for structural noun phrase disam-
biguation (Resnik 1993).

*We would like to thank the French scientific doc-
umentation center INIST/CNRS for providing us with
data. All the experiments reported in this paper have
been performed on [Pascal] a list of 71,623 multi-
domain terms and [Medic] a 1.56-million word medical
corpus composed of abstracts of scientific papers owned
by INIST/CNRS. Many thanks to Jean Royaute of IN-
IST/CNRS for his helpful and friendly collaboration.
This work has also benefited from rich discussions in
the research group Terminologie et Intelligence Artifi-
cielle of the PRC Intelligence Artificielle.

As terms mirror the concepts of the domain to 
which they belong, a constant knowledge evolu- 
tion leads to a constant term renewal. Thus ter- 
minological acquisition is a necessary companion 
to NLP, specifically when dealing with technical 
texts. 

Tools for terminological acquisition, whether
statistical, such as (Church & Hanks 1989), or
symbolic, such as (Bourigault 1993), acquire terms
from large corpora through a once-and-for-all pro-
cess without consideration for any prior termino-
logical knowledge. This lack of incrementality in
acquisition has the following drawbacks:

• Acquired terms must be merged with the initial
ones, taking possible variants into account.

• Acquired terms are neither conceptually nor lin- 
guistically related to the original ones. 

• The set of original terms is ignored although it 
could be a useful source of knowledge for acqui- 
sition. 

It is possible to conceive a finer approach to 
term acquisition by considering the local variants 
of terms within corpora. As term variants gen- 
erally involve more than one term, their extrac- 
tion can fruitfully exploit existing lists of terms in 
a process of non-massive incremental acquisition.



For example, if viral hepatitis is a known term, vi- 
ral and autoimmune hepatitis is a variant of this 
term (a coordination) which displays autoimmune 
hepatitis as a candidate term. Moreover, this coor-
dination indicates a strong closeness between the
interpretations of the two terms, which can be asso-
ciated with a link within a thesaurus. Henceforth,
potential terms acquired through acquisition tech- 
niques will be called candidate terms. The decision 
whether to include a candidate term into a termi- 
nology is outside the scope of our work. 

Acquiring with a Concern for Prior 
Knowledge 

Tools for acquiring terms generally operate on 
large corpora using various techniques to detect 
term occurrences. There are mainly two families 
of tools for term acquisition : statistical measures 
and NLP symbolic techniques. 

The first family which comprises most of the 
tools is composed of statistical analyzers which 
have little or no linguistic knowledge. These ap- 
plications take advantage of the specific statistical 
behavior of words composing terms : words which 
are lexically related tend to be found simultane- 
ously more frequently than they would be just by 
chance. Pure statistical methods such as (Church
& Hanks 1989) are rare. Generally some linguis-
tic knowledge is associated with the statistical mea-
sures through an a priori (Daille 1994) or an a posteriori
(Smadja 1993) filtering of correct syntactic pat-
terns. The assumption implicitly stated by statis-
tical works, and which is backed up by our study, 
is that it is more likely to find a term in the neigh- 
borhood of another one than anywhere else in a 
text. More specifically, we assume that the best 
way of combining two terms syntactically and se- 
mantically is to build a specific structure that we 
call a variant which is either a term or a restricted 
noun phrase and which is observed within a small 
span of words. 
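
The statistical intuition recalled above (related words co-occur more often than chance would predict) can be illustrated with the association ratio used in (Church & Hanks 1989), log2 of P(x,y) / (P(x) P(y)). The following Python sketch is only an illustration: the toy token list is invented, and co-occurrence is simplified to strict adjacency, whereas the original measure counts pairs within a window of words.

```python
import math
from collections import Counter

def association_ratio(tokens, x, y):
    """Estimate log2(P(x,y) / (P(x) * P(y))) from adjacent word pairs.

    Simplifying assumption: x and y co-occur only when y directly follows x.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    p_xy = bigrams[(x, y)] / (n - 1)
    if p_xy == 0:
        return float("-inf")  # the pair was never observed
    p_x = unigrams[x] / n
    p_y = unigrams[y] / n
    return math.log2(p_xy / (p_x * p_y))

# Invented toy text, for illustration only.
toy = ("serum albumin level and serum albumin binding "
       "in the serum of the patient").split()
score = association_ratio(toy, "serum", "albumin")
```

A positive score means that the pair is observed more often than if the two words were independent; an unseen pair yields minus infinity.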

The second approach to term acquisition con- 
sists of knowledge-based methods which rely on 
local grammars of noun phrases and compounds 
(Bourigault 1993). Word sequences accepted by
these grammars are extracted through a more or 
less shallow parse of corpora and are good candi- 
date terms. 

The drawback of both statistical and knowledge-
based acquisition is that they provide the user with
large lists of candidates which have to be manu-
ally filtered. For example, LEXTER (Bourigault
1993) extracts 20,000 occurrences from a
200,000-word corpus which represent 10,000 can-
didate terms. Such methods propose overly large
sets of terms because they lack initial termino-
logical knowledge and do not take terminological
variation into account. In order to
reduce the volume of acquisition and also to pro-
pose candidates which are more likely to be terms,
this paper presents a method based on an initial
list of terms called reference terms. The acquisi-
tion procedure starts from this supposedly compre-
hensive set of reference terms. It decomposes vari-
ations of these terms found in corpora and is then
able to detect candidate terms.

Updating Rather Than Acquiring

Is it realistic to suppose that lists of terms exist
for technical domains? The ever-growing mass
of electronic documents calls for tools for access-
ing these data which have to make extensive use
of term lists as sources of indexes. For this pur-
pose, and for other activities related to textual
databases, more and more thesauri exist. Some of
them, such as the Unified Medical Language Sys-
tem meta-thesaurus, carry conceptual and/or lin-
guistic information about the terms they contain.
In our experiment we have used the [Pascal] ter-
minological list composed of 71,623 multi-domain
terms without conceptual links, provided by the
documentation center INIST/CNRS.

Because of the availability of large term lists,
it is natural to lay a greater stress on the updat-
ing of such data than on their acquisition from
scratch. Therefore, our approach to acquisition fo-
cuses on how to improve a list of terms through
the observation of a corpus. Our approach also
differs from previous experiments on term acqui-
sition because it yields conceptual links between
candidate and reference terms. It can be used to
check or to enhance the conceptual knowledge of
thesauri in a way complementary to automatic se-
mantic clustering of terms through an observation
of their syntactic contexts (Grefenstette 1994).

A Micro-syntax for an Accurate
Extraction

The first step in our approach to terminologi-
cal acquisition is the extraction of term variants
from a large corpus. The tool used is FASTR,
a unification-based partial parser. FASTR recy-
cles lists of reference terms by transforming them
into grammar rules. Then, it dynamically builds
term variant rules from these term rules. The
parser is described in (Jacquemin 1994) and, here,
it will just be sketched out, by focusing on the
features that are relevant for terminological acqui- 
sition. More specifically, we will omit the aspects 
of the parser concerning its optimization and the 
feature structures associated with rules and meta- 
rules. 

In such a simplified framework, each reference
term corresponds to a PATR-II-like rule (Shieber
1986) comprising a context-free skeleton and lexi-
cal items. For example, rule (1) denotes the term
serum albumin with a (Noun) (Noun) structure:



Rule N1 -> N2 N3 :                        (1)
    <N2 lemma> = 'serum'
    <N3 lemma> = 'albumin'.



At a higher level, a set of meta-rules operates on 
the term rules and produces new rules describing 
potential variations. Each meta-rule is dedicated 
to a specific term structure and to a specific type 
of variation. For the sake of clarity, meta-rules are 
divided into two sets - meta-rules for two-word 
terms and meta-rules for three-word terms - and 
each set is subdivided into three subsets - meta- 
rules for coordination, insertion and permutation. 
Meta-rules for terms of four words or more are ig- 
nored because they produce very few variants (ap- 
proximately 1 % of the variants). Meta-rule (2) 
applies to rule (1) and yields a new rule (3) : 

Metarule Coor(X1 -> X2 X3)                (2)
    = X1 -> X2 C4 X5 X3 : .

Rule N1 -> N2 C4 X5 N3 :                  (3)
    <N2 lemma> = 'serum'
    <N3 lemma> = 'albumin'.

This transformed rule accepts any sequence serum
C4 X5 albumin as a variant of serum albumin
where C4 is any coordinating conjunction and X5
any single word. For example, it correctly recog-
nizes serum and egg albumin as a variant of serum
albumin. The second column of Table 1 presents
some other meta-rules for two-word terms together
with examples of pairs composed of a term and 
one of its variants. Currently, the meta-grammar 
of FASTR for English includes 73 meta-rules for 
2- and 3-word terms : 25 coordination meta-rules, 
17 insertion meta-rules and 31 permutation meta- 
rules (plus 66 meta-rules for 4-word terms which 
are not used for acquisition). 
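
As an illustration of how meta-rule (2) turns rule (1) into rule (3), here is a minimal Python sketch. It is not FASTR: the real parser is unification-based and works on feature structures, whereas this sketch matches plain lemma strings. The set of conjunctions, the function names and the sample sentence are assumptions introduced for the example.

```python
COORDINATORS = {"and", "or"}  # assumption: the paper does not enumerate them

def coor_metarule(term_rule):
    """Meta-rule (2): turn the term X2 X3 into the pattern X2 C4 X5 X3."""
    x2, x3 = term_rule
    return [("lemma", x2), ("conj", None), ("any", None), ("lemma", x3)]

def matches(pattern, words):
    """True if each word satisfies the constraint at the same position."""
    if len(pattern) != len(words):
        return False
    for (kind, value), word in zip(pattern, words):
        if kind == "lemma" and word != value:
            return False
        if kind == "conj" and word not in COORDINATORS:
            return False
    return True  # "any" positions accept every single word

def find_variants(pattern, tokens):
    """Slide the pattern over the token list and collect matching windows."""
    size = len(pattern)
    return [tokens[i:i + size]
            for i in range(len(tokens) - size + 1)
            if matches(pattern, tokens[i:i + size])]

serum_albumin = ("serum", "albumin")  # rule (1)
tokens = "level of serum and egg albumin in plasma".split()  # invented sentence
variants = find_variants(coor_metarule(serum_albumin), tokens)
```

On this invented sentence, the only window accepted by the transformed rule is serum and egg albumin.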

When term variants are described through meta-
rules as in FASTR, it is very simple to devise a pro-
cess for term acquisition: each paradigmatic meta-
rule (or skeleton of a filtering meta-rule) is linked
to a pattern extractor, yielding a candidate term.
As no further analysis of the variants is required, 
such an acquisition is extremely fast. The acquisi- 
tion of terms by extracting patterns from variants 
is processed as follows for the different categories 
of variants : 

• Coordination. The candidate term is the term 
coordinated with the original one. 

• Insertion. The candidate term is the term 
which has replaced the head of the original term 
through substitution. 

• Permutation. In a permutation of a 2-word 
term, the argument of the original term is shifted 
from the left of the head to its right and is trans- 
formed into a prepositional phrase. The candi- 
date term is the noun phrase inside this prepo- 
sitional phrase. This definition is extended to 
terms of 3 words or more where one of the argu- 
ments is permuted. 

The third column of Table 1 exemplifies patterns
of acquisition for each of the three categories of
term variants.
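
The three extraction patterns of Table 1 can be sketched as simple positional extractors. The Python fragment below is only an illustration of the idea of a pattern extractor attached to a meta-rule; the function and dictionary names are hypothetical, and each function covers only the single meta-rule shown in Table 1 for its category.

```python
def acquire_coordination(variant):
    """X2 X4 C5 X3 (variant of X2 X3) yields the candidate X2 X4."""
    x2, x4, _c5, _x3 = variant
    return (x2, x4)

def acquire_insertion(variant):
    """X2 X4 X3 (variant of X2 X3) yields the candidate X4 X3."""
    _x2, x4, x3 = variant
    return (x4, x3)

def acquire_permutation(variant):
    """X3 P4 X5 X2 (variant of X2 X3) yields the candidate X5 X2."""
    _x3, _p4, x5, x2 = variant
    return (x5, x2)

# Hypothetical registry linking each category of variant to its extractor.
EXTRACTORS = {
    "coordination": acquire_coordination,
    "insertion": acquire_insertion,
    "permutation": acquire_permutation,
}

examples = {
    "coordination": ("surgical", "exploration", "and", "closure"),
    "insertion": ("medullary", "thyroid", "carcinoma"),
    "permutation": ("center", "for", "disease", "control"),
}
candidates = {kind: EXTRACTORS[kind](words) for kind, words in examples.items()}
```

Since extraction is a fixed projection of the positions of the matched variant, no further analysis is needed once the variant has been recognized.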

This method for term acquisition does not sys-
tematically succeed for each encountered term
variant. Some correct variants involve only one
term instead of two or more and cannot produce
new candidates. For example, cells and their sub-
populations is a coordination variant of cell sub-
population which is unproductive compared with
the variant exemplified for coordination in Table 1.
Moreover, terms acquired through a variation may
already be reference terms (see the non-underlined
candidates in Tables 2, 3 and 4). When the reference
list is sufficiently comprehensive, it is expected,
and even desirable, that some of the acquired terms
are already known. Besides, "acquisitions" of
known terms are not useless because they reveal
conceptual links between these terms.

Acquiring Conceptual Classes 

Tables 2, 3 and 4 exemplify some terms acquired
through the three main kinds of variations ob-
served for English: coordinations, insertions and
permutations. The terms acquired through permu-
tations are not conceptually related to the original
ones due to the syntagmatic nature of this trans-
formation. In contrast, coordination and in-
sertion variations relate semantically close terms.
We examine in turn the decomposition of these two
kinds of variations with the aim of acquiring concep-
tual links.








Variation          Meta-rule and associated variant        Acquisition
-----------------  --------------------------------------  --------------------
Coordination       X2 X3 ↦ X2 X4 C5 X3                     X2 X4
(25 meta-rules)    surgical closure                        surgical exploration
                   ↦ surgical exploration and closure

Insertion          X2 X3 ↦ X2 X4 X3                        X4 X3
(17 meta-rules)    medullary carcinoma                     thyroid carcinoma
                   ↦ medullary thyroid carcinoma

Permutation        X2 X3 ↦ X3 P4 X5 X2                     X5 X2
(31 meta-rules)    control center                          disease control
                   ↦ center for disease control

Table 1: Acquisition through pattern extraction from variants. (Examples are from [Medic].)



Candidate term               Reference term
---------------------------  ---------------------
abdominal aorta              Thoracic aorta
acidic lipid                 Neutral lipid
active phase                 Latent phase
adrenal gland                Thyroid gland
affective disorder           Cognitive disorder
aged animal                  Young animal
agonist bromocriptine        Agonist antagonist
air conduction               Bone conduction
amniotic fluid estimation    Ratio estimation
aortic arch                  Aortic coarctation
aortic valve                 Mitral valve
arterial acid base           Arterial blood

Table 2: Examples of term acquisition through co-
ordination from [Medic]. Terms which do not be-
long to the reference list are underlined.



Candidate term               Reference term
---------------------------  ---------------------
abdominal spear injury       Penetrating injury
ablating tool                Cutting tool
absorbed dose                Radiation dose
access pressure              Blood pressure
accessory nerve              Spinal nerve
acetylcholine receptor       Muscarinic receptor
acetylcholine receptor       Nicotinic receptor
acid analysis                Organic analysis
acid base disorder           Metabolic disorder
action potential             Evoked potential
action potential             Membrane potential
activity curve               Time curve

Table 3: Examples of term acquisition through in-
sertion from [Medic]. Terms which do not belong
to the reference list are underlined.



Coordination 

Two terms are coordinated only if they share the 
same semantic scheme. For example, the variant 
surgical exploration and closure (see the first exam-
ple of Table 1) indicates that the two terms sur-
gical exploration and surgical closure are semanti-
cally close. They both denote a surgical act. This 
fact is interesting because some of the terms with a 
surgical (Noun) structure such as surgical shock do 
not belong to the same conceptual class and could 
not be coordinated with any of the surgical (Noun) 
terms from this class : *a surgical shock and clo- 
sure is incorrect. Thus, when heads are coordi- 
nated (approximately 15% of the coordinations) 
the head nouns of the terms must belong to the 
same semantic class (with respect to their entry 
selected by their argument). On the other hand, 
when arguments are coordinated, they must select 
the same entry of the head noun. For example, 
dorsal spine and cervical spine can be coordinated
since both are parts of the (nervous) spine, but
neither of them can be coordinated with a hedge-
hog or a fish spine. Such coordinations are useful
indicators for the disambiguation of a head word 
by its arguments : 

• For its classification with other related words 
through head coordination. 

• For the definition of its subsenses depending on 
its arguments through argument coordination. 

This kind of fine-grained selectional restriction 
has to be completed with more general information 
on argument structure through long distance de- 
pendencies. Such restrictions can be acquired from 
statistical measures on the results of a shallow syn- 
tactic analysis and semantic tags, whether manu-
ally assigned (Basili, Pazienza, & Velardi 1993) or
deduced from a thesaurus (Resnik 1993). These 
studies provide more general and systematic re- 
strictions than our approach and are applied to 
disambiguation or parsing tasks. Our acquisition
is restricted to local selection but takes advantage
of the pre-existing knowledge embodied in lists of
reference terms.

Candidate term               Reference term
---------------------------  ---------------------
[of] accessory cell          Cell proliferation
[in] acetabular growth       Growth factor
[of] activated b cell        Cell differentiation
[of] acute phase protein     Protein synthesis
[of] adipose tissue          Tissue extract
[in] adult cell              Cell function
[in] agarose gel             Gel electrophoresis
[of] airway control          Control method
[of] anaphylatoxin level     Level measurement
[of] aneuploid tumor cell    Cell population
[of] animal tolerance        Tolerance limit
[of] arterial pressure       Pressure control

Table 4: Examples of term acquisition through
permutation from [Medic]. Terms which do not
belong to the reference list are underlined.

The acquisition from variants, illustrated for one 
step in Table 1, is repeated on candidate terms
as long as new candidates are discovered. Then 
classes of compatible sense restrictions are built 
from terms related through constructions of coor- 
dination according to the following rule : 

Two terms $t$ and $t'$ are placed in the same class
if and only if there exists a chain of coordina-
tion variants from $t$ to $t'$: a set of $n$ terms
$t_1 = t, t_2, \ldots, t_{n-1}, t_n = t'$ such that for each
pair $(t_i, t_{i+1})$, $i \in \{1, 2, \ldots, n-1\}$, either $t_i$ is acquired
from a coordination variant of $t_{i+1}$ or $t_{i+1}$ is
acquired from a coordination variant of $t_i$.
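
The rule above amounts to computing the connected components of the undirected graph whose edges are the coordination links. A minimal Python sketch with a union-find structure follows; the function name is hypothetical. The first two links are taken from Table 2, while the third one (tricuspid valve) is invented to show how a chain merges three terms into one class.

```python
def coordination_classes(links):
    """Group terms into classes; links are (term, term acquired from it) pairs."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path compression
            term = parent[term]
        return term

    for source, acquired in links:
        root_a, root_b = find(source), find(acquired)
        if root_a != root_b:
            parent[root_a] = root_b  # direction of the link is irrelevant

    classes = {}
    for term in list(parent):
        classes.setdefault(find(term), set()).add(term)
    return list(classes.values())

links = [
    ("thoracic aorta", "abdominal aorta"),  # from Table 2
    ("mitral valve", "aortic valve"),       # from Table 2
    ("aortic valve", "tricuspid valve"),    # hypothetical link, for illustration
]
classes = coordination_classes(links)
```

With these links, two classes are obtained: the two aorta terms on one side and the three valve terms on the other.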

Figure 1 is a planar representation of the graph
constructed from one of the classes observed in the 
[Medic] corpus. Each arrow from a term t to a 
term t' indicates that t' has been acquired from a 
coordination variant of t. 

Leaving aside the only head coordination in
the figure, which holds between cirrhotic con-
trol and cirrhotic patient, all the terms have a
(Modifier) control structure¹ and can be coordi-
nated through an argument coordination. Conceptually,
the terms of Figure 1 are related to a common hy-
pernym whose linguistic utterance is medical con-
trol.

¹ Matched control is a partial term with a missing
noun argument which is not ruled out by our acqui-
sition process. With a proper acquisition, this term
would not appear as a candidate and the links issu-
ing from this term would issue from one of the correct
terms (Noun) matched control.

Moreover, the spatial organization of the graph 
outlines the central role played by normal control 
and disease control. These two terms are the most 
generic ones. Their root position in this acyclic 
graph (except for the two symmetric links) mirrors 
the linguistic fact that an argument coordination 
between two terms tends to place first the most 
generic argument and then the most specific one. 
Thus, although placed at a similar conceptual level 
in the taxonomy, these terms are ordered from the 
most generic to the most specific along the coor- 
dination links. This two-level observation reveals 
that linguistic clues, when precisely observed, are 
good indications of the conceptual organization. 

Insertion 

The meta-rules accounting for insertions insert one 
or more words inside a term string. The following 
meta-rule (4) denotes an insertion of one word in-
side a two-word term:

Metarule Ins(X1 -> X2 X3)                 (4)
    = X1 -> X2 X4 X3



The resulting structure is ambiguous depending on 
whether the leftmost word of the term is still an 
argument of the head noun in the variation (e.g. 
[inflammatory [bowel disease]]) or an argument of 
the inserted word (e.g. [[sunflower seed] oil]). The 
second structure is quite rare and does not corre-
spond to a genuine variant of the original term be-
cause it has a different argument structure. How-
ever, most of these possibly incorrect variants are
in fact correct. This happens whenever the reference
term (here sunflower oil) is an elided denomination
of the variant, which is in fact the full form of the
reference term. In this case, the non-ambiguity of
the elided form relies on pragmatic knowledge, be-
cause everyone knows that the seed is the part of
the sunflower used to make sunflower oil.

Whatever the structure of the variant, either
((X2 X4) X3) or (X2 (X4 X3)), the extraction of
the sequence X4 X3 as candidate term (see Ta-
bles 1 and 3) yields a correct term. When ex-
tracted from the latter structure, the candidate
term is more specific than the original one because
modifiers in the noun phrase tend to be ordered
from the most generic to the most specific.

[Graph not reproduced. Nodes: Normal control, Nondemanded control,
Disease control, Cirrhotic control, Cirrhotic patient, Uraemic control,
Solvent control, Young healthy control, Gender matched control,
Race matched control, Sex matched control, Age matched healthy control.]

Figure 1: Network of coordination links from [Medic].

As stated for coordination, iteration of acquisi-
tion on candidate terms yields conceptual classes.
However, the construction of the graph linking
terms acquired through insertion is not as straight-
forward as it is for coordination. The reason
is that one must first conflate conceptually close
terms that are likely to be coordinated before
constructing the hierarchy resulting from inser-
tion variants. Figure 2 has been constructed
by grouping together malignant tumor/benign tu-
mor, metastatic tumor/primary tumor and human
tumor/experimental tumor which have been ob-
served in coordinated constructions. A further
grouping of rat tumor with human tumor was nec-
essary but was not indicated by a coordination in
our corpus. Similarly, a general category of (Part
of body) tumors has been created although only
some coordinations were observed among the pos-
sible ones: mammary/skin, mammary/pancreas,
cutaneous/corneal, liver/lung, bone/soft tissue...

Due to the parallel between insertion construc- 
tions and generic/specific links, there is a good 
similarity between the observed graph and the 
taxonomy of this part of the terminology. An 
exception to this rule is the link from (Part of 
body) tumor to malignant tumor coming from the 
variant ovarian malignant tumor. It is indeed an 
exceptional link : there are fifteen different links 
from malignant tumor to more specific terms but 
only one link from a more specific term (ovarian 
tumor) to malignant tumor. 



Incrementality and Robustness 

As introduced for the construction of conceptual 
classes, the acquisition method is repeated incre- 
mentally. Candidates are acquired from candi- 
dates of the preceding step until no new term is 
discovered : 

A term is a candidate if and only if there exists
a chain of couples $(t_i, t_{i+1})$, $i \in \{1, 2, \ldots, n-1\}$,
where $t_{i+1}$ is acquired from a variant of $t_i$ and where
$t_1$ is a reference term. That is to say that the
set of candidates is the closure of the set of
reference terms through the relation of acqui-
sition.

Due to the finite corpus, due to the finite length
of terms and due to the non-circularity of the def-
inition, the incremental acquisition reaches a fixed
point after a finite number of iterations. It takes
fifteen cycles to complete an acquisition of 5,080
terms when starting from the 71,623 terms of the
[Pascal] list².
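
The closure defined above can be computed by a simple fixed-point iteration. The Python sketch below is an illustration under stated assumptions: the relation "acquired from a variant of" is given as a precomputed dictionary (in practice it would result from parsing the corpus), the function name is hypothetical, and the toy links only loosely follow the first sequence of Table 5, with an invented back-link added to show that re-acquiring a known term does not prevent termination.

```python
def incremental_acquisition(reference_terms, acquired_from):
    """Return (acquired terms, number of productive cycles)."""
    seen = set(reference_terms)      # terms already used as a starting point
    acquired = set()                 # every term produced by some variant
    frontier = set(reference_terms)  # terms whose variants are examined next
    cycles = 0
    while frontier:
        found = set()
        for term in frontier:
            found |= acquired_from.get(term, set())
        acquired |= found
        frontier = found - seen      # only new terms feed the next cycle
        seen |= frontier
        if frontier:
            cycles += 1
    return acquired, cycles

toy_links = {
    "tissue extract": {"tumour tissue"},
    "tumour tissue": {"normal tissue"},
    "normal tissue": {"rat tissue"},
    "rat tissue": {"tumour tissue"},  # invented back-link, for illustration
}
acquired, cycles = incremental_acquisition({"tissue extract"}, toy_links)
```

The loop stops as soon as a cycle discovers no new term, which is the fixed point mentioned in the text.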

² Among these 71,623 terms, only 12,717 are found
in the [Medic] corpus under their basic form or one of
its correct variants.

[Graph not reproduced. Recoverable node labels: Metastatic/primary tumor;
(Part of body) (skin/brain/lung...) tumor; Neuroendocrine tumor;
Embryonal tumor; Malignant tumor; Lymphoma tumor.]

Figure 2: Network of insertion links from [Medic].

Table 5 shows five sequences of acquisition ob-
tained from term variants in [Medic] starting from
a reference term in [Pascal]. For example, the
first sequence indicates the acquisition of tumour 
tissue from tissue extract through a permutation 
variant (extract of tumour tissue) followed by the
acquisition of normal tissue from a coordination 
(tumour or normal tissue), and so on. This se- 
quence mixes the three kinds of variations while 
the last three are restricted to insertions and/or 
coordinations. When not using permutation, the 
acquisition process yields smaller sets of terms : 
it produces 2,998 terms in fourteen steps through 
coordinations and insertions, 2,193 terms in seven 
steps through insertions and 357 terms in six steps 
through coordinations. The sets obtained without 
the use of permutation are "better" candidates be- 
cause they are produced by transformations which 
yield compounds. Permutations, which transform 
compounds into syntactic noun phrases, tend to 
produce candidates of a lower quality. 

As our method is based on the observation of
rare occurrences, the number of acquired terms
depends on the set of reference terms. As in-
dicated in (Enguehard 1994), such a correlation
does not exist in her statistical approach to term
acquisition because she observes larger sets of
(co-)occurrences. Figure 3 exemplifies acquisition
curves for different values of the volume of refer-
ence terms. It shows that the size of the acquisition
gradually degrades when the size of the bootstrap
decreases: 5,080 terms are acquired when starting
from the total list of 12,717 terms, 3,833 terms are
still acquired from a bootstrap of 6,000 terms and
2,329 terms from a bootstrap of 1,000 terms. Thus,
with only a twelfth of the initial bootstrap, almost
half the terms are still acquired. Although a se-
rious degradation of the results is observed under
this lower limit, these values suggest that acquisi-
tion depends more on the size of the corpus than
on the initial terminology. As a partial initial list
of terms is easily compensated for by a larger corpus,
the completeness of the reference list is not a cru-
cial issue for the quality of the acquisition in our
framework.
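
The proportions quoted above can be checked with a few lines of arithmetic. The Python sketch below only uses the figures given in the text; the function name is hypothetical.

```python
FULL_BOOTSTRAP = 12_717  # reference terms found in [Medic]
FULL_ACQUIRED = 5_080    # terms acquired from the full bootstrap

def retention(bootstrap, acquired):
    """Fractions of the bootstrap used and of the acquisition preserved."""
    return bootstrap / FULL_BOOTSTRAP, acquired / FULL_ACQUIRED

for bootstrap, acquired in [(6_000, 3_833), (1_000, 2_329)]:
    used, kept = retention(bootstrap, acquired)
    print(f"{used:.1%} of the bootstrap -> {kept:.1%} of the acquisition")
```

A bootstrap of 1,000 terms is about 7.9% of the full list (roughly a twelfth) and still yields about 45.8% of the full acquisition, while 6,000 terms (about 47.2%) yield about 75.5%.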

Var          Acquired terms
-----------  ----------------------  ----------------------
P            Tissue extract          tumour tissue
C-I          normal tissue           rat tissue
P-I          sprague dawley rat      female rat
I-I          f3U rat                 strain rat
P-I          milan strain            normotens. strain
I            control strain

I            Blood cell              leukemic cell
C-C          normal cell             cf cell
I-I          pancreatic cell         beta cell
[illegible]  alpha cell              activated NK cell

I            Cell line               tumor line
I-I          derived cell line       t cell line
I-I          leukemia cell line      u937 cell line
I            histiocytic cell line

C            Experimental study      clinical study
C-C          echocardiogr. study     doppler study
C            angiography study

C            Pigment disorder        nail disorder
C-C          nail change             palmar change

Table 5: Examples of sequences of acquisition.

[Plot not reproduced. Vertical axis: # acquired terms.]

Figure 3: Acquisition volumes for different sizes of
bootstrap on [Medic] corpus.

Conclusion and Future Work

This study has proposed a novel approach to ter-
minological acquisition that differs from the two
main trends in this domain: morpho-syntactic fil-
tering or statistical extraction. The main feature
of our approach is accounting for existing lists of
terms by observing their variants and yielding con-
ceptual links as well as candidate terms. As long
as they are accessible through morpho-syntactic
dependencies in a corpus, these links can be used
to automatically construct parts of the taxonomy
representing the knowledge in this domain. Among
the applications of this method are lexical acqui-
sition, thesaurus discovery and technological sur-
vey. More generally, terminological enrichment is
necessary for NLP activities dealing with techni-
cal sublanguages because their efficiency and their
quality depend on the completeness of their lexi-
cons of terms and compounds.